User-friendly biplots in R
Centre for Multi-Dimensional Data Visualisation (MuViSU)
muvisu@sun.ac.za
SASA 2024
The biplotEZ package aims to provide users with EZier software to construct biplots.
What is a biplot?
Visualisation of multi-dimensional data in 2 or 3 dimensions.
A brief history of biplots and biplotEZ.
1971
Gabriel, K.R., The biplot graphic display of matrices with application to principal component analysis. Biometrika, 58(3), pp.453-467.
1976
Prof Niël J le Roux presents a seminar on biplots.

1996
John Gower and David Hand publish Biplots.

Prof le Roux introduces a Masters module on Biplots (Multidimensional scaling).
Rika Cilliers obtains her Masters on biplots for socio-economic progress under Prof le Roux.
1997
SASA conference paper: S-PLUS FUNCTIONS FOR INTERACTIVE LINEAR AND NON-LINEAR BIPLOTS by SP van Blerk, NJ le Roux & S Gardner.
2001
Sugnet Gardner (Lubbe) obtains her PhD on biplots under Prof le Roux.

Louise Wood obtains her Masters on biplots for socio-economic development under Prof le Roux.
2003: Adele Bothma obtains her Masters on biplots for school results under Prof le Roux.
2007: Idele Walters obtains her Masters on biplots for exploring the gender gap under Prof le Roux.
2008: Ryan Wedlake obtains his Masters on robust biplots under Prof le Roux.
2009
BiplotGUI for Interactive Biplots, Anthony le Grange.
2010: André Mostert obtains his Masters on biplots in industry under Prof le Roux.
2011
John Gower, Sugnet Lubbe and Niël le Roux publish Understanding Biplots.

R package UBbipl developed with the book, but never published.
2013: Hilmarie Brand obtains her Masters on PCA and CVA biplots under Prof le Roux.
2014: Opeoluwe Oyedele obtains her PhD on Partial Least Squares biplots under Sugnet Lubbe.
2015: Ruan Rossouw obtains his PhD on using biplots for multivariate process monitoring under Prof le Roux.
2016: Ben Gurr obtains his Masters on biplots for crime data under Prof le Roux.
2019
Johané Nienkemper-Swanepoel obtains her PhD on MCA biplots under Prof le Roux and Sugnet Lubbe.

Carel van der Merwe obtains his PhD using biplots. Carel supervises 4 Master’s projects on biplots.
2020
Raeesa Ganey obtains her PhD on Principal Surface Biplots under Sugnet Lubbe.

André Mostert obtains his PhD on multidimensional scaling for identification of contributions to out of control multivariate processes under Sugnet Lubbe.
Adriaan Rowen obtains his Masters using biplots to understand black-box machine learning models.
2022
Zoë-Mae Adams obtains her Masters on biplots in sentiment classification under Johané Nienkemper-Swanepoel.

2023
bipl5 for Exploding Biplots, Ruan Buys.
2024
Ruan Buys obtains his Masters on Exploding biplots under Carel van der Merwe.

Adriaan Rowen to submit his PhD using biplots to understand black-box machine learning models.
Peter Manefeldt to submit his Masters using multidimensional scaling for interpretability of random forest models.








Canonical space of dimension 1.
Solve \(\mathbf{BM=WM\Lambda}\) where \(\mathbf{M} = \begin{bmatrix} \mathbf{m}_1 & \mathbf{M}_2\\ \end{bmatrix}\)
\[ \bar{\mathbf{Y}} = \bar{\mathbf{X}} \mathbf{M} = \begin{bmatrix} \bar{y}_{11} & 0 & \dots & 0 \\ \vdots & \vdots & & \vdots\\ \bar{y}_{K1} & 0 & \dots & 0 \\ \end{bmatrix} \]
\[ \mathbf{\Lambda} = \mathrm{diag}(\lambda, 0, \dots, 0) \]
Total squared reconstruction error for means: \(TSREM = \mathrm{tr}\{ (\bar{\mathbf{X}}-\hat{\bar{\mathbf{X}}})(\bar{\mathbf{X}}-\hat{\bar{\mathbf{X}}})'\} = 0\)
Total squared reconstruction error for samples: \(TSRES = \mathrm{tr}\{ ({\mathbf{X}}-\hat{{\mathbf{X}}})({\mathbf{X}}-\hat{{\mathbf{X}}})'\} >0\)
Minimise \(TSRES\) (Default option)
Alternative option: Maximise Bhattacharyya distance. For more details see
Solve \(\mathbf{BM=WM\Lambda}\) where \(\mathbf{M} = \begin{bmatrix} \mathbf{m}_1 & \mathbf{M}_2\\ \end{bmatrix}\)
Minimise \(TSRES\)
\[ \mathbf{M}^{-1} = \begin{bmatrix} \mathbf{m}^{(1)} \\ \mathbf{M}^{(2)}\\ \end{bmatrix} \]
\[ \mathbf{M}^{(2)}\mathbf{M}^{(2)'} = \mathbf{UDV}' \]
\[ \mathbf{M}_{opt} = \begin{bmatrix} \mathbf{m}_1 & \mathbf{M}_2\mathbf{V}\\ \end{bmatrix} \]
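The steps for a one-dimensional canonical space can be sketched in base R. This is a minimal illustration under stated assumptions, not the biplotEZ implementation: the data (two iris species), the scaling \(\mathbf{M}'\mathbf{WM} = \mathbf{I}\), the helper `tsre()` and all variable names are choices made here for the example. The 2D reconstruction keeps the first canonical dimension plus one further column of \(\mathbf{M}\).

```r
# Two-class data: setosa and versicolor from the built-in iris data (assumed example)
keep <- iris$Species != "virginica"
X <- scale(as.matrix(iris[keep, 1:4]), center = TRUE, scale = FALSE)
g <- droplevels(iris$Species[keep])
p <- ncol(X)

G    <- model.matrix(~ g - 1)        # n x K class indicator matrix
N    <- crossprod(G)                 # K x K diagonal matrix of class sizes
Xbar <- solve(N, crossprod(G, X))    # K x p matrix of class means
B    <- t(Xbar) %*% N %*% Xbar       # between-class SSP matrix
W    <- crossprod(X) - B             # within-class SSP matrix

# Solve BM = WM Lambda with M'WM = I through the symmetric matrix W^{-1/2} B W^{-1/2}
eW <- eigen(W, symmetric = TRUE)
W_inv_half <- eW$vectors %*% diag(1 / sqrt(eW$values)) %*% t(eW$vectors)
eig <- eigen(W_inv_half %*% B %*% W_inv_half, symmetric = TRUE)
M <- W_inv_half %*% eig$vectors      # M = [m_1  M_2]; only eig$values[1] is non-zero

# Total squared reconstruction error when keeping the first two columns of M
J <- diag(c(1, 1, rep(0, p - 2)))
tsre <- function(A, M) {             # hypothetical helper, not a package function
  A_hat <- A %*% M %*% J %*% solve(M)
  sum((A - A_hat)^2)
}

# Rotate M_2 by V from the SVD of M^(2) M^(2)' to obtain M_opt
Minv   <- solve(M)
M2_inv <- Minv[-1, , drop = FALSE]   # M^(2): the last p - 1 rows of M^{-1}
sv     <- svd(M2_inv %*% t(M2_inv))
M_opt  <- cbind(M[, 1], M[, -1] %*% sv$v)

cat("TSREM        :", tsre(Xbar, M), "\n")
cat("TSRES (M)    :", tsre(X, M), "\n")
cat("TSRES (M_opt):", tsre(X, M_opt), "\n")
```

With two classes the class means are reconstructed exactly whichever second column is chosen, so only \(TSRES\) depends on the rotation of \(\mathbf{M}_2\). Under the scaling assumed here, \(TSRES\) for \(\mathbf{M}_{opt}\) equals the sum of the singular values in \(\mathbf{D}\) other than the largest.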
Any 2D representation of sample points, for example
# Initial stress : 0.01116
# stress after 10 iters: 0.00833, magic = 0.018
# stress after 20 iters: 0.00614, magic = 0.213
# stress after 30 iters: 0.00561, magic = 0.500
# stress after 40 iters: 0.00558, magic = 0.500
To create a biplot we need to add information on the variables.
\[ \mathbf{X}:n \times p \]
\[ \mathbf{Z}:n \times 2 \]
\[ \mathbf{X = ZB + E} \]
\[ \mathbf{B = (Z'Z)}^{-1}\mathbf{Z'X} \]
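A minimal R sketch of this regression step follows. It assumes the 2D configuration \(\mathbf{Z}\) comes from Sammon mapping via `MASS::sammon()` and uses the unique rows of the built-in iris measurements as example data. Both are choices made here for illustration. The least-squares fit of \(\mathbf{X = ZB + E}\) is \(\mathbf{B} = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{X}\), a \(2 \times p\) matrix whose \(j\)-th column gives the direction of the axis for variable \(j\).

```r
library(MASS)  # sammon(); an assumed choice of 2D representation

# Example data: unique rows only, because sammon() rejects zero distances
X <- scale(as.matrix(unique(iris[, 1:4])), center = TRUE, scale = FALSE)

# Any n x 2 configuration of the samples would do; here a Sammon mapping
Z <- sammon(dist(X), k = 2, trace = FALSE)$points
Z <- scale(Z, center = TRUE, scale = FALSE)

# Least-squares solution of X = ZB + E
B <- solve(crossprod(Z), crossprod(Z, X))   # 2 x p
E <- X - Z %*% B                            # residuals, orthogonal to Z

# Marker for the (centred) value mu on the axis of variable j:
# the point on the line through b_j whose prediction z'b_j equals mu
axis_marker <- function(mu, b) mu * b / sum(b^2)   # hypothetical helper
axis_marker(1, B[, 1])
```

Each column of `B` can be drawn as a line through the origin of the 2D display, with markers placed by `axis_marker()` and shifted back by the column means for labelling in the original units.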
Are linear axes a good representation when the transformation from \(\mathbf{X}:n \times p\) to \(\mathbf{Z}:n \times 2\) is nonlinear?
Replace linear regression with splines.
# Calculating spline axis for variable 1
# Calculating spline axis for variable 2
# Calculating spline axis for variable 3
# Calculating spline axis for variable 4
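To give a feel for what a curved axis looks like, the sketch below uses a simplified stand-in for the spline axes: each coordinate of \(\mathbf{Z}\) is regressed on a natural cubic spline basis of one variable, and the fitted curve is traced over a grid of that variable's values. This is not necessarily the algorithm used by biplotEZ. The function `spline_axis()`, the choice `df = 3`, the Sammon mapping and the iris data are all assumptions made for the example.

```r
library(MASS)     # sammon(); an assumed choice of nonlinear 2D mapping
library(splines)  # ns(), natural cubic spline basis

X <- as.matrix(unique(iris[, 1:4]))   # unique rows: sammon() needs non-zero distances
Z <- sammon(dist(X), k = 2, trace = FALSE)$points

# Hypothetical helper: trajectory of one variable in the 2D display
spline_axis <- function(xj, Z, df = 3, n_grid = 50) {
  fit  <- lm(Z ~ ns(xj, df = df))     # both coordinates of Z on a spline of x_j
  grid <- seq(min(xj), max(xj), length.out = n_grid)
  cbind(value = grid, predict(fit, newdata = data.frame(xj = grid)))
}

# One curved axis per variable: the value, then its fitted 2D position
axes <- lapply(seq_len(ncol(X)), function(j) spline_axis(X[, j], Z))
head(axes[[1]])
```

Drawing columns 2 and 3 of each element of `axes` as a curve, with tick labels taken from the `value` column, gives a nonlinear axis that can bend where the mapping from \(\mathbf{X}\) to \(\mathbf{Z}\) is far from linear.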